Early Experience on Using Knights Landing Processors for Lattice Boltzmann Applications
Authors
Abstract
Knights Landing (KNL) is the second generation of Intel processors based on the Many Integrated Core (MIC) architecture, targeting the HPC application segment. It delivers massive thread and data parallelism together with high on-chip memory bandwidth in a standalone processor that can boot an off-the-shelf Linux operating system. KNL provides more than 3 TFlops of computing power for double-precision computation, doubling to 6 TFlops for single precision. In this work we assess the performance of this new processor for Lattice Boltzmann codes, which are widely used in computational fluid dynamics. We design and implement an OpenMP code and evaluate the impact of several data memory layouts on meeting the different computing requirements of distinct parts of the application, aiming to exploit a large fraction of the available peak computing throughput. We also perform a preliminary analysis of energy efficiency, evaluating the time-to-solution and average power consumption for each memory layout, and compare the results with other processors and accelerators.
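The abstract does not name the memory layouts it evaluates. As a minimal sketch, assuming the two most common choices for lattice codes, Array of Structures (AoS) and Structure of Arrays (SoA), the index mappings below show how the same (site, population) pair lands at different offsets in a flat array. The constants `NPOP` and `NSITES` and all function names are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
# Hypothetical sizes for illustration only (not values from the paper).
NPOP = 4      # populations stored per lattice site
NSITES = 6    # number of lattice sites


def aos_index(site: int, pop: int) -> int:
    """Array of Structures: all populations of one site are contiguous."""
    return site * NPOP + pop


def soa_index(site: int, pop: int) -> int:
    """Structure of Arrays: one population of all sites is contiguous."""
    return pop * NSITES + site


def stride_between_sites(index_fn, pop: int = 0) -> int:
    """Distance, in elements, between the same population at adjacent sites."""
    return index_fn(1, pop) - index_fn(0, pop)


if __name__ == "__main__":
    print("AoS stride between sites:", stride_between_sites(aos_index))
    print("SoA stride between sites:", stride_between_sites(soa_index))
```

Under these assumptions, a loop over sites touches one population with a stride of `NPOP` elements in AoS and with unit stride in SoA. Unit stride generally suits wide vector units better, while AoS keeps all populations of one site close together, which is one reason different parts of an application can prefer different layouts.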
Similar resources
Energy-Efficiency Evaluation of Intel KNL for HPC Workloads
Energy consumption is increasingly becoming a limiting factor to the design of faster large-scale parallel systems, and development of energy-efficient and energy-aware applications is today a relevant issue for HPC code-developer communities. In this work we focus on energy performance of the Knights Landing (KNL) Xeon Phi, the latest many-core architecture processor introduced by Intel for th...
External and Internal Incompressible Viscous Flows Computation using Taylor Series Expansion and Least Square based Lattice Boltzmann Method
The lattice Boltzmann method (LBM) has recently become an alternative and promising computational fluid dynamics approach for simulating complex fluid flows. Despite its enormous success in many practical applications, the standard LBM is restricted to the lattice uniformity in the physical space. This is the main drawback of the standard LBM for flow problems with complex geometry. Several app...
Performance Analysis and Optimization of Parallel Scientific Applications on CMP Clusters
Chip multiprocessors (CMP) are widely used for high performance computing. Further, these CMPs are being configured in a hierarchical manner to compose a node in a cluster system. A major challenge to be addressed is efficient use of such cluster systems for large-scale scientific applications. In this paper, we quantify the performance gap resulting from using different number of processors pe...
A Comparative Solution of Natural Convection in an Open Cavity using Different Boundary Conditions via Lattice Boltzmann Method
A Lattice Boltzmann method is applied to compare the results of simulating natural convection in an open-ended cavity using different hydrodynamic and thermal boundary conditions. The Prandtl number in the present simulation is 0.71, the Rayleigh numbers are 10^4, 10^5 and 10^6, and viscosities of 0.02 and 0.05 are selected. On-Grid bounce-back method with first-order accuracy and non-slip met...
Next Generation Science Applications for the Next Generation of Supercomputing
The Trinity supercomputer deployment by Los Alamos and Sandia National Laboratories represents the first Advanced Technology System (ATS) deployment for the United States National Nuclear Security Administration (NNSA). The platform will be one of the largest XC40 deployments in the world when final integration of its nearly ten-thousand nodes of dual-socket Intel Xeon Haswell and ten-thousand ...